Abstract. We discuss multiscale representations of discrete manifold-valued data. As it
turns out that we cannot expect general manifold-analogues of biorthogonal wavelets to 
possess perfect reconstruction, we focus our attention on those constructions which are based 
on upscaling operators which are either interpolating or midpoint-interpolating. For definable 
multiscale decompositions we obtain a stability result. 
1. Introduction

1.1. The problem area. The correct multiscale representation of manifold-valued data is a
basic question whenever one wishes to eliminate the arbitrariness in choosing coordinates for
such data, and to avoid artifacts caused by applying linear methods to the ensuing coordinate
representations of data. This question appears to have been proposed first by D. Donoho [2].
The detailed paper [11] describes different constructions, including most of ours, and states
results inferred from numerical experiments, but without giving proofs. A series of papers,
starting with [12], has since dealt with the systematic analysis of upscaling operations on
discrete data, also known under the name subdivision rules, in the case that data live in
Lie groups, Riemannian manifolds, and other nonlinear geometries. Regarding smoothness
of limits, a satisfactory solution has been achieved by means of the method of proximity
inequalities, which also play a role in the present paper. Multiscale decompositions in particular
have been investigated by [6] (characterizing smoothness by decay of detail coefficients) and
[8] (stability).

The present paper studies multiscale decompositions which are analogous to linear biorthogonal
wavelets and reviews the known examples based on interpolatory and midpoint-interpolating
subdivision rules, including the simple Haar wavelets. It turns out, however, that a rather
general way of defining manifold analogues of linear constructions is unlikely to have perfect
reconstruction. This is the first main result of this paper, even if it is of a rather vague
nature. For those multiscale decompositions which exist, we show a stability theorem, which
represents the second main result of the paper. We further discuss averaging procedures which
work in manifolds equipped with an exponential mapping and which generalize the well-known
Riemannian center of mass. This discussion does not contain substantial new results, but it is
included because we need this construction for the definition of nonlinear up- and downscaling
rules, as well as for converting continuous data to discrete data in the first place.

1.2. Biorthogonal wavelets revisited. We begin by briefly reviewing the notion of biorthogonal
Riesz wavelets, but we are content with the properties relevant for the following sections.
We start with real-valued sequences $\alpha = (\alpha_j)_{j\in\mathbb{Z}}$ with finite support, which are called filters,
and define the upscaling rule, or subdivision rule, associated with the filter $\alpha$ by

$(S_\alpha c)_k := \sum_{l\in\mathbb{Z}} \alpha_{k-2l}\, c_l.$

Here $c : \mathbb{Z} \to V$ is any sequence with values in a vector space. The transpose of the upscaling
rule (we skip the definition of transpose) shall be the downscaling rule $D_\beta$ associated with the
filter $\beta$, via

$(D_\beta c)_k := \sum_{l\in\mathbb{Z}} \beta_{l-2k}\, c_l.$

Upscaling and downscaling commute with the left shift operator $(Lc)_k = c_{k+1}$ in the following
way:

$S_\alpha L = L^2 S_\alpha, \qquad D_\beta L^2 = L D_\beta.$

The most basic rules are defined by the delta sequence: $S_\delta$ inserts zeros between the elements
of the original sequence, and $D_\delta$ deletes every other element. All rules can be expressed in
terms of $S_\delta$, $D_\delta$, and convolution:

$S_\delta c = (\dots, c_0, 0, c_1, 0, c_2, \dots), \qquad D_\delta c = (\dots, c_0, c_2, c_4, \dots)$
$\implies S_\alpha c = (S_\delta c) * \alpha, \qquad D_\beta c = D_\delta(c * \beta).$
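The defining sums of the upscaling and downscaling rules translate directly into code. The following is a minimal sketch, assuming periodic sequences of finite length (an assumption made only so that everything is finite) and filters stored as dictionaries from index to weight. The function names `upscale`, `downscale`, `shift` and the sample filters are hypothetical choices for illustration.

```python
import numpy as np

def upscale(alpha, c):
    """(S_alpha c)_k = sum_l alpha[k - 2l] * c_l for a periodic sequence c of length n."""
    n = len(c)
    out = np.zeros(2 * n)
    for l in range(n):
        for j, a in alpha.items():          # j plays the role of k - 2l
            out[(2 * l + j) % (2 * n)] += a * c[l]
    return out

def downscale(beta, c):
    """(D_beta c)_k = sum_l beta[l - 2k] * c_l for a periodic sequence c of even length."""
    m = len(c)
    out = np.zeros(m // 2)
    for k in range(m // 2):
        for j, b in beta.items():           # j plays the role of l - 2k
            out[k] += b * c[(2 * k + j) % m]
    return out

def shift(c, times=1):
    """Left shift (L c)_k = c_{k+1}, applied `times` times (periodic)."""
    return np.roll(c, -times)

delta = {0: 1.0}                            # delta sequence
alpha = {-1: 0.5, 0: 1.0, 1: 0.5}           # hypothetical low-pass upscaling filter
beta = {0: 0.5, 1: 0.5}                     # hypothetical low-pass downscaling filter

c = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
print(upscale(delta, c[:3]))                # zeros inserted between the entries
print(downscale(delta, c))                  # every other entry deleted
```

With these definitions the two commutation relations with the shift operator can be checked numerically on any periodic test sequence.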
We suppress the indices $\alpha$, $\beta$ from now on. We assume a further upscaling rule $R$ and a
downscaling rule $Q$, which shall be high-pass filters, in contrast to the low-pass filters $S$ and $D$.
Any sequence $c^{(j)}$ which is interpreted as data at level $j$ may be recursively decomposed
into a low-frequency part $c^{(j-1)}$ (data at level $j-1$) and a high-frequency part $d^{(j)}$ (details
at level $j$) by letting

(1) $c^{(j-1)} = D c^{(j)}, \qquad d^{(j)} = Q c^{(j)}.$

This process can be iterated in order to obtain a pyramid consisting of coarse data $c^{(0)}$ and
wavelet coefficients $d^{(1)}, \dots, d^{(j)}$. Data at level $j$ shall be reconstructed by

(2) $c^{(j)} = S c^{(j-1)} + R d^{(j)},$

which works precisely if the so-called quadrature mirror filter equation, 

(3) SD + RQ = id, 

holds. It makes sense to require certain further ('biorthogonality') properties like QR = 
id. In particular, high pass downscaling should annihilate everything generated by low pass 
upscaling: 

(4) QS = 0. 

An important consequence of the previous properties is that we can rewrite (1) in the form

(5) $c^{(j-1)} = D c^{(j)}, \qquad d^{(j)} = Q\big(c^{(j)} - S c^{(j-1)}\big).$

There are many examples of biorthogonal wavelet decompositions. In the following we give 
some examples. 

1.3. Examples: interpolating and midpoint-interpolating schemes. 

Example 1.1. An upscaling scheme is called interpolating, if it keeps the original data, which 
is expressed by 

$(Sc)_{2k} = c_k \iff D_\delta S = \mathrm{id}.$

For interpolating schemes, downscaling is simply $D = D_\delta$. Then detail coefficients are the
difference between data $c$ and the prediction gained via upscaling of $Dc$. With the left shift
operator, we can write

$Qc = DL(c - SDc).$

If we define detail coefficients via (5), then we can also employ the modified downscaling
operator

$Q^{\mathrm{modif}} = DL.$

Reconstruction works via a basic upscaling rule:

$R = L^{-1} S_\delta.$

It is easy to check that we have indeed perfect reconstruction. An example is furnished by
the four-point scheme [1] defined by $(\alpha_{-3}, \dots, \alpha_3) = (-\frac{1}{16}, 0, \frac{9}{16}, 1, \frac{9}{16}, 0, -\frac{1}{16})$. The action
$c^{(j-1)} = Dc^{(j)}$ of the decimation operator is consistent with the interpretation of discrete
data $c^{(j)}_k$ as samples of a continuous function $f(t)$ at the parameter value $t = 2^{-j}k$.
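The interpolating decomposition of Example 1.1 can be sketched for the four-point scheme as follows. This is an illustration under the assumption of periodic real-valued data; the function names are hypothetical. Even entries are kept as coarse data, and the details are the prediction errors at the odd positions.

```python
import numpy as np

def four_point_upscale(c):
    """Periodic four-point scheme: keeps c_k at even positions, predicts odd positions."""
    n = len(c)
    out = np.empty(2 * n)
    out[0::2] = c
    # (Sc)_{2k+1} = (-c_{k-1} + 9 c_k + 9 c_{k+1} - c_{k+2}) / 16
    out[1::2] = (-np.roll(c, 1) + 9 * c + 9 * np.roll(c, -1) - np.roll(c, -2)) / 16
    return out

def decompose(c):
    """One level: coarse = D_delta c, detail = odd entries of c - S D c."""
    coarse = c[0::2].copy()
    detail = (c - four_point_upscale(coarse))[1::2]
    return coarse, detail

def reconstruct(coarse, detail):
    """c = S coarse + R detail, where R places the details at the odd positions."""
    out = four_point_upscale(coarse)
    out[1::2] += detail
    return out

c = np.sin(2 * np.pi * np.arange(16) / 16)
coarse, detail = decompose(c)
print(np.max(np.abs(reconstruct(coarse, detail) - c)))   # reconstruction error
```

Because the scheme is interpolating, the even entries are reproduced exactly and the odd entries are corrected by the stored details, which is the perfect reconstruction property claimed in the example.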



¹ The quadrature mirror filter equation is usually formulated in terms of Fourier transforms.
Example 1.2. The Haar scheme is defined by the rules

$S = (L + \mathrm{id})S_\delta, \quad D = \tfrac12 D_\delta(L + \mathrm{id}), \quad R = (\mathrm{id} - L)S_\delta, \quad Q = \tfrac12 D_\delta(\mathrm{id} - L),$

which operate as follows:

$Sc = (\dots, c_0, c_0, c_1, c_1, \dots), \qquad Rd = (\dots, d_0, -d_0, d_1, -d_1, \dots),$

$Dc = (\dots, \tfrac{c_0 + c_1}{2}, \tfrac{c_2 + c_3}{2}, \dots), \qquad Qc = (\dots, \tfrac{c_0 - c_1}{2}, \tfrac{c_2 - c_3}{2}, \dots).$
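The Haar rules are simple enough to verify the relations (3) and (4) numerically. The sketch below implements the four operators through their action on the pairs $(c_{2k}, c_{2k+1})$; fixing the positions this way is an indexing assumption of the sketch, and the function names are hypothetical.

```python
import numpy as np

def S(c):
    """Low-pass upscaling: (..., c0, c0, c1, c1, ...)."""
    return np.repeat(c, 2)

def D(c):
    """Low-pass downscaling: averages of the pairs (c_{2k}, c_{2k+1})."""
    return (c[0::2] + c[1::2]) / 2

def R(d):
    """High-pass upscaling: (..., d0, -d0, d1, -d1, ...)."""
    out = np.repeat(d, 2)
    out[1::2] *= -1
    return out

def Q(c):
    """High-pass downscaling: half differences of the pairs (c_{2k}, c_{2k+1})."""
    return (c[0::2] - c[1::2]) / 2

c = np.array([4.0, 2.0, 7.0, 1.0, 0.0, 6.0, 3.0, 3.0])
print(D(c), Q(c))                 # coarse data and details
print(S(D(c)) + R(Q(c)))          # equals c
```

In this setting $SD + RQ = \mathrm{id}$, $QS = 0$, $QR = \mathrm{id}$ and $DS = \mathrm{id}$ all hold exactly.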

Example 1.3. A subdivision scheme $S$ is called midpoint-interpolating if it is a right inverse
of the decimation operator $D$ which computes midpoints and which is also used for the Haar
wavelets of Example 1.2:

$DS = \mathrm{id}$, where $Dc = (\dots, \tfrac{c_0 + c_1}{2}, \tfrac{c_2 + c_3}{2}, \dots).$

The detail coefficients are the difference between the actual data $c$ and the imputation $SDc$
found by upscaling the decimated data. Since $c - SDc$ is by construction in the kernel of $D$
(i.e., is an alternating sequence), it contains redundant information. We thus complete our
definitions by letting

$Qc = D_\delta(c - SDc) = (\dots, (c - SDc)_0, (c - SDc)_2, \dots),$

$Rd = (\mathrm{id} - L)S_\delta d = (\dots, d_0, -d_0, d_1, -d_1, \dots).$

If we define detail coefficients via (5), then a much simpler downscaling operator for details
can be employed:

$Q^{\mathrm{modif}} = D_\delta.$

The action $c^{(j-1)} = Dc^{(j)}$ of the decimation rule is consistent with the interpretation of
discrete data $c^{(j)}_k$ as an average of continuous data over the interval $2^{-j}[k, k+1]$.

The defining relation implies that any such $S$ can be turned into an interpolating subdivision
rule $\tilde S$ by adding one round of midpoint computation:

$\tilde S = \tfrac12 (L + \mathrm{id}) S.$

$\tilde S$ is interpolatory, since $D_\delta \tilde S = \tfrac12 (D_\delta L + D_\delta) S = DS = \mathrm{id}$. The relation $S = 2(L + \mathrm{id})^{-1} \tilde S$
leads to a way of finding midpoint-interpolating schemes from interpolatory ones, since it can
be turned into an effective computation by the use of symbols [5]. For more information on
that kind of schemes, see e.g. [3].

2. Biorthogonal decompositions for manifold-valued data

2.1. Manifold analogues of linear elementary constructions. The main idea in applying
the previous constructions to manifold-valued data is to find replacements for the elementary
operations they are composed of. These are the operations $-$ ("vector is difference of points"),
$+$ ("point plus vector is a point"), and computing the weighted average of points, which again
yields a point. As to which kind of data are points and which are vectors, data $c^{(j)}$ at level $j$
shall be manifold-valued sequences of points, while detail coefficients $d^{(j)}$ shall be sequences
with values in vector spaces associated with the manifold.
For data with values in a Lie group $G$, with associated Lie algebra $\mathfrak{g}$, we let

$p \oplus v := p \exp(v), \qquad q \ominus p := \log(p^{-1}q) \in \mathfrak{g},$

where exp is the group exponential function and log is its inverse. For matrix groups, we
have $\exp(x) = \sum_{k \ge 0} x^k/k!$ as usual (see e.g. (Tj for Lie theory). In a surface or Riemannian
manifold $M$, we use the exponential mapping $\exp_p$, which maps a vector $v$ in the tangent
space $T_pM$ to the endpoint of the geodesic of length $\|v\|$ which emanates from $p$ with initial
tangent vector $v$:

$p \oplus v := \exp_p(v), \qquad q \ominus p := \exp_p^{-1}(q) \in T_pM.$

We have thus found analogues $\oplus$ and $\ominus$ of the $+$ and $-$ operations, respectively. An average
with weights $a_j$ of total sum 1 is in Euclidean space equivalently definable by

(6) $m = \sum a_j x_j \iff \sum a_j (x_j - m) = 0 \iff \sum a_j \operatorname{dist}(x_j, m)^2 = \min.$

The middle definition carries over to both Lie groups and Riemannian manifolds (provided
$m$ is unique, which it locally is):

(7) $\sum a_j (x_j \ominus m) = 0.$

In Riemannian manifolds, this average is the same as the one defined by the right hand 
condition. These constructions have been employed to define operations on manifold-valued 
data before, in particular subdivision processes. For more details the reader is referred to jH]. 
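As an illustration of the operations $\oplus$, $\ominus$ and of the averaging condition (7), here is a minimal sketch on the unit sphere in $\mathbb{R}^3$. The sphere, the representation of tangent vectors as vectors of $\mathbb{R}^3$ orthogonal to the base point, and the fixed-point iteration $m \leftarrow m \oplus \sum a_j (x_j \ominus m)$ are assumptions of this sketch and are not prescribed by the text. All function names are hypothetical.

```python
import numpy as np

def sphere_exp(p, v):
    """p (+) v: endpoint of the geodesic from the unit vector p with initial tangent v."""
    t = np.linalg.norm(v)
    if t < 1e-15:
        return p
    out = np.cos(t) * p + np.sin(t) * v / t
    return out / np.linalg.norm(out)        # guard against rounding drift

def sphere_log(p, q):
    """q (-) p: tangent vector at p pointing towards q, of length dist(p, q)."""
    w = q - np.dot(p, q) * p                # part of q orthogonal to p
    nw = np.linalg.norm(w)
    if nw < 1e-15:
        return np.zeros_like(p)
    theta = np.arctan2(nw, np.dot(p, q))    # geodesic distance
    return theta * w / nw

def weighted_average(points, weights, iters=50):
    """Approximate the solution m of sum_j a_j (x_j (-) m) = 0 by fixed-point iteration."""
    m = points[0]
    for _ in range(iters):
        step = sum(a * sphere_log(m, x) for a, x in zip(weights, points))
        m = sphere_exp(m, step)
    return m

raw = np.array([[0.1 * i, 0.05 * i * i, 1.0] for i in range(4)])
points = raw / np.linalg.norm(raw, axis=1, keepdims=True)
weights = [0.1, 0.4, 0.4, 0.1]

m = weighted_average(points, weights)
residual = sum(a * sphere_log(m, x) for a, x in zip(weights, points))
print(np.linalg.norm(residual))             # close to zero
```

The iteration is only expected to converge for data that are sufficiently close together, in line with the local uniqueness mentioned above.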

Another way of redefining averages is by means of an auxiliary base point: In a vector
space, we have

$\sum a_i = 1 \implies \sum a_i x_i = x + \sum a_i (x_i - x)$

for any choice of $x$. This leads to the definition

(8) $x \oplus \sum a_i (x_i \ominus x)$

of a manifold average which involves the choice of an additional base point.

Example 2.1. It is not difficult to see that the weights $a_0 = a_1 = \frac12$ lead to a symmetric
average $m = \mu(x_0, x_1) = x_0 \oplus \frac12(x_1 \ominus x_0) = x_1 \oplus \frac12(x_0 \ominus x_1)$, which can be taken as the
manifold midpoint of $x_0$ and $x_1$. It fulfills the balance condition $(x_1 \ominus m) + (x_0 \ominus m) = 0$.

An obvious generalization, where the averaging process possibly works with a continuum
of values, is defined as follows: Consider a set $X$ which is equipped with some probability
measure. For instance we could take the unit interval $X = [0, 1]$ with Lebesgue measure. The
weighted average $m$ of data $(f(t))_{t \in X}$ with values in a vector space is defined by the following
equivalent definitions:

(9) $m = \int_X f(x) \iff \int_X (f(x) - m) = 0 \iff \int_X \operatorname{dist}(f(x), m)^2 = \min.$

In the case that $X$ is the integers, and the measure means giving each $i \in \mathbb{Z}$ the weight $a_i$,
this definition reduces to (6). Also the integral version of the average can be made to
work for manifold-valued data, by defining $m$ via

(10) $\int_X (f(x) \ominus m) = 0.$

In the Riemannian case, which has been thoroughly discussed by Karcher [9], this is equivalent
to $\int_X \operatorname{dist}(f(x), m)^2 = \min$. It is then called the Riemannian center of mass (see Section IX.2
of [10]).

2.2. Manifold versions of filters. We now define nonlinear analogues of the up- and downscaling
rules $S$, $D$, $Q$, $R$. In order to distinguish the linear rules from their nonlinear counterparts,
we write the linear ones as $S_{lin}$, $D_{lin}$, $Q_{lin}$, $R_{lin}$. The symbols $\mathcal{S}, \mathcal{D}, \mathcal{Q}, \mathcal{R}$ denote nonlinear up- and
downscaling operators which, like the linear ones, commute with the left shift operator in the
following way:

$\mathcal{S}L = L^2\mathcal{S}, \qquad \mathcal{D}L^2 = L\mathcal{D}, \qquad \mathcal{R}L = L^2\mathcal{R}, \qquad \mathcal{Q}L^2 = L\mathcal{Q}.$

We now decompose manifold-valued data 'at level $j$', which are denoted by the symbol $c^{(j)}$,
in a manner similar to (5):

(11) $c^{(j-1)} = \mathcal{D}c^{(j)}, \qquad d^{(j)} = \mathcal{Q}\big(c^{(j)} \ominus \mathcal{S}\mathcal{D}c^{(j)}\big).$

By iteration we arrive at data $c^{(0)}$ at the coarsest scale together with a pyramid of detail
coefficients $d^{(1)}, \dots, d^{(j)}$. In order to obtain perfect reconstruction via

(12) $c^{(j)} = \mathcal{S}c^{(j-1)} \oplus \mathcal{R}d^{(j)}$

we impose the following condition on the nonlinear operators, which could be interpreted as
a nonlinear quadrature mirror filter equation:

(13) $\mathcal{S}\mathcal{D}c \oplus \mathcal{R}\mathcal{Q}(c \ominus \mathcal{S}\mathcal{D}c) = c$ for all $c$.

2.3. Examples: interpolating and midpoint-interpolating schemes. 

Example 2.2 (manifold version of Example 1.2). We show how the Haar scheme can be
made to work in groups and in Riemannian manifolds. With the midpoint $\mu(p, q)$ of Example
2.1 we let

$\mathcal{S}c = S_{lin}c = (\dots, c_0, c_0, c_1, c_1, \dots),$

$\mathcal{D}c = (\dots, \mu(c_0, c_1), \mu(c_2, c_3), \dots),$

while $\mathcal{Q} = Q_{lin}$ and $\mathcal{R} = R_{lin}$. Indeed, $c \ominus \mathcal{S}\mathcal{D}c$ is an alternating sequence of vectors, and the
detail coefficients associated with data $c$ are given by

$d = Q_{lin}(c \ominus \mathcal{S}\mathcal{D}c)$

$= Q_{lin}(\dots, c_0 \ominus \mu(c_0, c_1), c_1 \ominus \mu(c_0, c_1), c_2 \ominus \mu(c_2, c_3), \dots)$

$= (\dots, c_0 \ominus \mu(c_0, c_1), c_2 \ominus \mu(c_2, c_3), \dots).$

It is obvious that with this definition, $\mathcal{S}\mathcal{D}c \oplus \mathcal{R}d = c$, so we have perfect reconstruction.
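The manifold Haar decomposition can be tried out on the unit sphere. The following sketch assumes the sphere in $\mathbb{R}^3$ as the manifold and stores each detail coefficient as a vector of $\mathbb{R}^3$ tangent to the sphere at the corresponding coarse point; both are assumptions of the illustration, and the function names are hypothetical.

```python
import numpy as np

def exp_map(p, v):
    """p (+) v on the unit sphere."""
    t = np.linalg.norm(v)
    if t < 1e-15:
        return p
    out = np.cos(t) * p + np.sin(t) * v / t
    return out / np.linalg.norm(out)

def log_map(p, q):
    """q (-) p on the unit sphere (tangent vector at p)."""
    w = q - np.dot(p, q) * p
    nw = np.linalg.norm(w)
    if nw < 1e-15:
        return np.zeros_like(p)
    return np.arctan2(nw, np.dot(p, q)) * w / nw

def midpoint(p, q):
    """Geodesic midpoint mu(p, q) = p (+) 1/2 (q (-) p)."""
    return exp_map(p, 0.5 * log_map(p, q))

def haar_decompose(c):
    """Coarse points mu(c_{2k}, c_{2k+1}) and details c_{2k} (-) mu(c_{2k}, c_{2k+1})."""
    n = len(c) // 2
    coarse = np.array([midpoint(c[2 * k], c[2 * k + 1]) for k in range(n)])
    detail = np.array([log_map(coarse[k], c[2 * k]) for k in range(n)])
    return coarse, detail

def haar_reconstruct(coarse, detail):
    """c_{2k} = m_k (+) d_k and c_{2k+1} = m_k (+) (-d_k)."""
    out = np.empty((2 * len(coarse), 3))
    for k in range(len(coarse)):
        out[2 * k] = exp_map(coarse[k], detail[k])
        out[2 * k + 1] = exp_map(coarse[k], -detail[k])
    return out

# Sample points along a curve on the sphere (unit vectors by construction).
k = np.arange(8)
c = np.stack([np.sin(0.2 * k) * np.cos(0.3 * k),
              np.sin(0.2 * k) * np.sin(0.3 * k),
              np.cos(0.2 * k)], axis=1)

coarse, detail = haar_decompose(c)
print(np.max(np.abs(haar_reconstruct(coarse, detail) - c)))   # reconstruction error
```

The odd entries are recovered from $-d_k$ because the coarse point is the geodesic midpoint of each pair, which is the alternating property used in the example.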

Example 2.3 (manifold version of Example 1.1). To find a nonlinear analogue $\mathcal{S}$ of a linear
upscaling rule defined by affine averages, we can employ geometric averages instead. In this
way the interpolating scheme $S_{lin} = S_\alpha$ can be transferred to the geometric setting, by letting

$(\mathcal{S}c)_{2k} = c_k, \qquad \sum_r \alpha_{2r+1}\big(c_{k-r} \ominus (\mathcal{S}c)_{2k+1}\big) = 0.$

The remaining rules can be taken from the linear case (using the fact that the simplest rules
can be applied to any sequence, as its elements do not undergo computations):

$\mathcal{D} = D_{lin} = D_\delta, \qquad \mathcal{Q} = Q_{lin} = Q^{\mathrm{modif}} = D_\delta L, \qquad \mathcal{R} = R_{lin} = L^{-1}S_\delta.$

From the interpolating property of $\mathcal{S}$ we see that we have perfect reconstruction.
Example 2.4 (manifold version of Example 1.3). In order to make a midpoint-interpolating
rule $S_{lin}$ work on manifolds, we define an upscaling operator $\mathcal{S}$ which retains the crucial
property that $c_k$ is the midpoint of $(\mathcal{S}c)_{2k}$ and $(\mathcal{S}c)_{2k+1}$. For this purpose we use (8). We
introduce the following notation for sequences $c$, $v$ and a point $x \in M$:

$(c \ominus x)_k := c_k \ominus x, \qquad (x \oplus v)_k := x \oplus v_k,$

and define

$(\mathcal{S}c)_{2k} = c_k \oplus \big(S_{lin}(c \ominus c_k)\big)_{2k}, \qquad (\mathcal{S}c)_{2k+1} = c_k \oplus \big(S_{lin}(c \ominus c_k)\big)_{2k+1}.$

It is clear from $(c \ominus c_k)_k = 0$ and the midpoint-interpolating property of $S_{lin}$ that $\mathcal{S}$ is also
midpoint-interpolating:

$\mu\big((\mathcal{S}c)_{2k}, (\mathcal{S}c)_{2k+1}\big) = c_k.$

We use the same downscaling operators $\mathcal{Q}$, $\mathcal{D}$ as in the Haar case of Example 2.2, which yields

$d^{(j)}_k = \big(c^{(j)} \ominus \mathcal{S}c^{(j-1)}\big)_{2k}.$

By midpoint interpolation, $c^{(j-1)}$ and $d^{(j)}$ together determine the original data $c^{(j)}$: With
the geodesic reflection $\sigma_x(y)$ of $y$ in the point $x$ defined by

$\sigma_x(y) = x \oplus (-(y \ominus x))$ or, locally equivalently, $\mu(y, \sigma_x(y)) = x$,

we have

$c^{(j)}_{2k} = (\mathcal{S}c^{(j-1)})_{2k} \oplus d^{(j)}_k, \qquad c^{(j)}_{2k+1} = \sigma_{c^{(j-1)}_k}\big(c^{(j)}_{2k}\big).$

This construction has appeared in the literature before. A nonlinear upscaling operator $\mathcal{R}$ which effects
exactly this construction via $c^{(j)} = \mathcal{S}c^{(j-1)} \oplus \mathcal{R}d^{(j)}$ necessarily depends on the data and may
be defined by

$(\mathcal{R}d)_{2k} = d_k, \qquad (\mathcal{R}d)_{2k+1} = \sigma_{c^{(j-1)}_k}\big((\mathcal{S}c^{(j-1)})_{2k} \oplus d_k\big) \ominus (\mathcal{S}c^{(j-1)})_{2k+1}.$

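In Riemannian geometry we cannot further simplify this expression. In the case of matrix
groups, we employ the fact that $\sigma_x(y) = x y^{-1} x$ and that successive points with indices
$2k$, $2k+1$ of $\mathcal{S}c$ are converted into each other by geodesic reflection in the point $c_k$:

$(\mathcal{R}d)_{2k+1} = \log\Big(\big((\mathcal{S}c^{(j-1)})_{2k+1}\big)^{-1}\, c^{(j-1)}_k \exp(-d_k)\, \big((\mathcal{S}c^{(j-1)})_{2k}\big)^{-1}\, c^{(j-1)}_k\Big)$

$= \log\Big(\big(c^{(j-1)}_k\big)^{-1} (\mathcal{S}c^{(j-1)})_{2k} \exp(-d_k)\, \big((\mathcal{S}c^{(j-1)})_{2k}\big)^{-1}\, c^{(j-1)}_k\Big)$

$= -\mathrm{Ad}_{(c^{(j-1)}_k)^{-1} (\mathcal{S}c^{(j-1)})_{2k}}(d_k).$

Here we have used the notation $\mathrm{Ad}_g(v) = g v g^{-1}$. Note that in abelian groups and especially
in Euclidean space, where $y \oplus v = y + v$, this formula reduces to $(\mathcal{R}d)_{2k+1} = -(\mathcal{R}d)_{2k}$.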
2.4. On the general feasibility of the construction. The examples of geometric and
nonlinear multiscale decompositions given above are special cases, which are based on
interpolatory subdivision rules or on midpoint-interpolating rules. It is not clear how perfect
reconstruction can be achieved in general. We shall presently see that there are some basic
obstructions which disappear in the linear case. For simplicity we consider only periodic
sequences, because then the upscaling and downscaling rules have a finite-dimensional domain
of definition.

Prop. 2.5. Smooth rules $\mathcal{S}, \mathcal{D}, \mathcal{Q}, \mathcal{R}$ can lead to detail coefficients with perfect reconstruction
for periodic data $c \in M^{2n}$ only if the rank of the mapping $c \mapsto c \ominus \mathcal{S}\mathcal{D}c$ equals $n \cdot \dim M$,
which is half the generic rank of such a mapping.

Proof. Equation (13), which expresses perfect reconstruction, is equivalent to

$\mathcal{R}\mathcal{Q}x = x$, where $x = c \ominus \mathcal{S}\mathcal{D}c$.

It follows that the mapping $c \mapsto c \ominus \mathcal{S}\mathcal{D}c = \mathcal{R}\mathcal{Q}(c \ominus \mathcal{S}\mathcal{D}c)$ has rank $\le n \cdot \dim M$, because
$\mathcal{Q}$, mapping $2n$ data items to $n$ detail coefficients, has this property. As to the mapping
$c \mapsto \mathcal{S}\mathcal{D}c$, its rank does not exceed $n \cdot \dim M$, because $\mathcal{D}$ has this property. In case the rank is
less than $n \cdot \dim M$, the mapping $\mathrm{id}_{M^{2n}} : c \mapsto \mathcal{S}\mathcal{D}c \oplus (c \ominus \mathcal{S}\mathcal{D}c)$ would have rank $< 2n \cdot \dim M$,
a contradiction. □

The condition of rank $n \cdot \dim M$, which is necessary for perfect reconstruction as mentioned
in Prop. 2.5, is unlikely to be satisfied if both upscaling by $\mathcal{S}$ and downscaling by $\mathcal{D}$ are defined
via geometric averaging rules derived from linear rules $S_\alpha$ and $D_\beta$. The following discussion
of derivatives should make this clear: We have

(14) $\sum_l \beta_{l-2k}\,\big(c_l \ominus (\mathcal{D}c)_k\big) = 0, \qquad \sum_l \alpha_{k-2l}\,\big((\mathcal{D}c)_l \ominus (\mathcal{S}\mathcal{D}c)_k\big) = 0,$

and we are interested in the change in $(\mathcal{S}\mathcal{D}c)_k$ if each $c_l$ undergoes a 1-parameter variation.
We use the abbreviations $\varphi$ and $\psi$ for the derivatives of $\ominus$ with respect to the first and second
argument, respectively. In the Lie group case, where all tangent vectors are represented
by elements of the Lie algebra $\mathfrak{g}$, both $\varphi$ and $\psi$ are linear endomorphisms of $\mathfrak{g}$. In case of
Riemannian manifolds, where $\ominus : M \times M \to TM$, both $\varphi$, $\psi$ map to $T_{p \ominus q}(TM)$. It is not
necessary to look closer at this abstract tangent space, because we always combine $\psi^{-1}$ with
$\varphi$, and the image of $\varphi$ occurs only implicitly. Differentiation of (14) yields a linear relation
between the variations of the $c_l$ and the resulting variations of $(\mathcal{D}c)_k$ and $(\mathcal{S}\mathcal{D}c)_k$.

The precise form of this relation is not relevant, but by observing that the differentials of $\ominus$
have to be evaluated at many more independent locations than the desired rank $n \cdot \dim M$
would suggest, it is clear that only very special filters can lead to rank $n \cdot \dim M$. The
situation in the linear case is different: The differentials of $\ominus$ are constant, and the condition
that the relation defines a mapping of rank $n$ is an algebraic condition involving the
coefficients of the filters $\alpha$, $\beta$.
Similar considerations show that also the so-called log-exponential construction, where a
nonlinear rule is constructed via (8) (see Ex. 2.4), does not in general yield the rank condition
expressed by Prop. 2.5.

3. Stability analysis 
The point of going through the trouble of decomposing a signal is that one expects many
detail coefficients $d^{(k)}_l$ to be small and therefore to be negligible. This is the basis of
thresholding in order to compress data, which makes sense only if one can control the change in
reconstructed data if we change the detail coefficients by resetting some of them to zero.
Similarly, quantizing data will result in deviation from the original. Again, it is important to
control that change. It is the purpose of this section to establish a stability result for nonlinear
rules which applies to such situations.

3.1. Coordinate representations of nonlinear rules. For the stability analysis we transfer
all manifold operations to a local coordinate chart. This is justified only if the constructions
we are going to analyze are local. The linear upscaling and downscaling rules defined
previously have this property, and so have the nonlinear ones mentioned in the examples
above.

The operators $\oplus$, $\ominus$ are replaced by their respective coordinate representations, which are
denoted by the same symbols and which are defined in open subsets of suitable coordinate
vector spaces: We assume that $\oplus$ maps from $V \times W$ into $V$, and $\ominus$ maps from $V \times V$ into $W$.
Besides smoothness they are assumed to fulfill the compatibility condition

(15) $p \oplus (q \ominus p) = q.$

We further assume that $\oplus$, $\ominus$ are Lipschitz functions, i.e., there exist constants $A$, $B$ with

(16) $A\|p - q\| \le \|p \ominus q\| \le B\|p - q\|.$

Locally this is always the case. Our analysis of stability requires that the operators $\mathcal{S}, \mathcal{D}, \mathcal{Q}, \mathcal{R}$
(we do not introduce new symbols for their coordinate representations) fulfill some reasonable
assumptions which are listed below. Notation makes use of the symbol "$\lesssim$", which means that
there is a uniform constant such that the left hand side is less than or equal to that constant
times the right hand side. For a sequence $w = (w_i)_{i\in\mathbb{Z}}$ we use the notation $\|w\| := \sup_{i\in\mathbb{Z}} \|w_i\|$.

• Boundedness of $\mathcal{Q}$, $\mathcal{R}$: The mappings $\mathcal{Q}$, $\mathcal{R}$ operate on $W$-valued sequences $w$, which
are generated as the difference of point sequences. They are supposed to satisfy $\|\mathcal{Q}w\|$,
$\|\mathcal{R}w\| \lesssim \|w\|$, with respect to some norm $W$ is equipped with.

• Reproduction of constants: For constant data we require that $\mathcal{S}c = c$ and $\mathcal{D}c = c$.

• Each of $\mathcal{S}, \mathcal{R}, \mathcal{D}, \mathcal{Q}$ shall be as smooth as is needed (in general a little more than $C^1$
will suffice).

• First-order linearity of $\mathcal{S}, \mathcal{D}$ on constant data: For constant sequences $c$ we require that

(17) $d\mathcal{S}|_c = S_{lin}, \qquad d\mathcal{D}|_c = D_{lin}$

for some low-pass upscaling and downscaling operators $S_{lin}$, $D_{lin}$ operating on $V$-valued
sequences, and where $S_{lin}$ is a convergent subdivision rule. The only exception
shall be the Haar case, where $S_{lin}$ shall be the splitting rule (see Ex. 1.2). This
condition is natural when one considers $\mathcal{S}, \mathcal{D}$ as geometric analogues of linear
constructions which are defined by replacing affine averages by geometric averages, or by
replacing the $+$ and $-$ operations by $\oplus$ and $\ominus$.
3.2. Stability Results. The aim of this section is to prove the following stability theorem:

Theorem 3.1. Suppose that $\mathcal{S}, \mathcal{D}, \mathcal{Q}, \mathcal{R}$ are upscaling and downscaling operators which fulfill
the nonlinear version (13) of the quadrature mirror filter equation, and which also fulfill the
technical conditions listed above. Consider a data pyramid $(c^{(j)})_{j \ge 0}$ with $c^{(j-1)} = \mathcal{D}c^{(j)}$ which
enjoys the weak contractivity property

(18) $\|\Delta c^{(j)}\| \lesssim \mu^j \qquad (\mu < 1).$

Then the reconstruction procedure of data $c^{(j)}$ at level $j$ from coarse data $c^{(0)}$ and details
$d^{(1)}, \dots, d^{(j)}$ is stable in the sense that there are constants $D$, $E_1$, $E_2$ such that for all $j$ and
any further data pyramid $\tilde c^{(j)}$ with details $\tilde d^{(k)}$ we have

(19) $\|c^{(0)} - \tilde c^{(0)}\| \le E_1, \quad \|d^{(k)} - \tilde d^{(k)}\| \le E_2\,\mu^k$ for all $k$

(20) $\implies \|c^{(j)} - \tilde c^{(j)}\| \le D\Big(\|c^{(0)} - \tilde c^{(0)}\| + \sum_{k=1}^{j} \|d^{(k)} - \tilde d^{(k)}\|\Big).$

The assumption of decay given by (18) is fulfilled for any finite data pyramid (simply adjust
the constant which is implied by using the symbol "$\lesssim$").



3.3. Proofs. The remaining part of this section is devoted to the proof of this statement.
Our arguments closely follow the ones in [8], which will enable us to occasionally skip over
some purely technical details and focus on the main ideas.

The crux is to show that the differentials of the reconstruction mappings are uniformly
bounded. We shall go about this task by using perturbation arguments. The justification of
this approach lies in the fact that by our assumptions the nonlinear reconstruction procedure
agrees with a linear one up to first order on constant data. Indeed, our assumptions already
imply that $\mathcal{S}$ satisfies a proximity condition with $S_{lin}$ in the sense of [12]:

Lemma 3.2. With the above assumptions we have the inequalities

(21) $\|\mathcal{S}c - S_{lin}c\| \lesssim \|\Delta c\|^2, \qquad \|\mathcal{D}c - D_{lin}c\| \lesssim \|\Delta c\|^2.$

Proof. We use a first order Taylor expansion of $\mathcal{S}$. For any constant sequence $e$ we have
$S_{lin}e = \mathcal{S}e = e$, so

$\mathcal{S}c = \mathcal{S}e + d\mathcal{S}|_e(c - e) + O(\|c - e\|^2)$

$= e + S_{lin}(c - e) + O(\|c - e\|^2) = S_{lin}c + O(\|c - e\|^2).$

Since $\mathcal{S}$ and $S_{lin}$ are local operators we may choose $e$ such that

$\|c - e\| \lesssim \|\Delta c\|.$

This proves the first inequality. The proof of the second one is the same. □
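The second-order nature of the proximity inequality (21) can be observed numerically in a simple case. The sketch below compares the geodesic midpoint of two points on the unit sphere with their linear midpoint. It uses the ambient coordinates of $\mathbb{R}^3$ in place of a coordinate chart, which is an assumption made for this illustration only, and the function names are hypothetical.

```python
import numpy as np

def geodesic_midpoint(a, b):
    """Midpoint of the shorter great-circle arc between unit vectors a and b."""
    s = a + b
    return s / np.linalg.norm(s)

def proximity_ratio(theta):
    """||D c - D_lin c|| / ||Delta c||^2 for two points at geodesic distance theta."""
    a = np.array([1.0, 0.0, 0.0])
    b = np.array([np.cos(theta), np.sin(theta), 0.0])
    nonlinear = geodesic_midpoint(a, b)
    linear = 0.5 * (a + b)
    return np.linalg.norm(nonlinear - linear) / np.linalg.norm(b - a) ** 2

ratios = [proximity_ratio(0.5 / 2 ** i) for i in range(6)]
print(ratios)    # the ratio stays bounded as the points move together
```

In this configuration the ratio equals $1/(8\cos^2(\theta/4))$, so it remains bounded as $\theta \to 0$, which is the behaviour that the lemma asserts in general.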

We now show that for all initial data $c^{(j)}$ with exponential decay of $\|\Delta c^{(j)}\|$, the associated
detail coefficients experience the same type of decay.

Lemma 3.3. Assume that (18) holds for $(c^{(j)})_{j \ge 0}$. Then

(22) $\|d^{(j)}\| \lesssim \mu^j.$
Proof. We use the boundedness of $\mathcal{Q}$ and Lemma 3.2 to estimate the norm of detail coefficients:

$\|d^{(j)}\| = \|\mathcal{Q}(c^{(j)} \ominus \mathcal{S}c^{(j-1)})\| \lesssim \|c^{(j)} \ominus \mathcal{S}c^{(j-1)}\| \lesssim \|c^{(j)} - \mathcal{S}c^{(j-1)}\|$

$\le \|c^{(j)} - S_{lin}\mathcal{D}c^{(j)}\| + \|S_{lin}c^{(j-1)} - \mathcal{S}c^{(j-1)}\|$

$\le \|c^{(j)} - S_{lin}D_{lin}c^{(j)}\| + \|S_{lin}(\mathcal{D}c^{(j)} - D_{lin}c^{(j)})\| + \|S_{lin}c^{(j-1)} - \mathcal{S}c^{(j-1)}\|$

$\lesssim \|c^{(j)} - S_{lin}D_{lin}c^{(j)}\| + \mu^{2j}.$

It remains to estimate $\|c^{(j)} - S_{lin}D_{lin}c^{(j)}\|$. Reproduction of constants implies that for any
constant sequence $e$,

$\|c^{(j)} - S_{lin}D_{lin}c^{(j)}\| = \|c^{(j)} - e - S_{lin}D_{lin}(c^{(j)} - e)\| \lesssim \|c^{(j)} - e\|.$

By the locality of $S_{lin}$ and $D_{lin}$ we can pick $e$ such that $\|c^{(j)} - e\| \lesssim \|\Delta c^{(j)}\|$. This concludes
the proof. □

For later use we record the following two facts. The first one is a perturbation theorem
which has been shown in [12].

Theorem 3.4. Assume that $S_{lin}$ is a convergent linear subdivision scheme and that $\mathcal{S}$ satisfies
$d\mathcal{S}|_c = S_{lin}$ for all constant data $c$. Then there exists $\mu < 1$ such that

(23) $\|\Delta \mathcal{S}^j c\| \lesssim \mu^j$

for all initial data $c$ with $\|\Delta c\|$ small enough.

We do not want to go into details concerning the precise meaning of 'small enough'. The 
reader who is interested in the considerable technical subtleties arising from this restriction 
and also the fact that S is usually not globally defined is referred to our previous work El [6] 
where these issues are rigorously taken into account and the appropriate bounds for ||Ac|| are 
derived. 

The second result is also a perturbation result which has been shown in [8].

Lemma 3.5. Let $A_i$, $U_i$ be operators on a normed vector space. Assume exponential decay
$\|U_i\| \lesssim \mu^i$ for some $\mu < 1$. Then uniform boundedness of $\|A_1 \cdots A_k\|$ implies uniform
boundedness of $\|(A_1 + U_1) \cdots (A_k + U_k)\|$.
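The statement of Lemma 3.5 can be illustrated in a special case where the bound is elementary. The sketch below assumes that every $A_i$ is the same $2 \times 2$ rotation matrix, so that all products of the $A_i$ have norm 1, and that the perturbations $U_i$ are random matrices scaled to have spectral norm exactly $\mu^i$. These choices and the function name are hypothetical. Submultiplicativity of the norm then gives the bound $\prod_i (1 + \mu^i) \le \exp(\mu/(1-\mu))$, independently of $k$.

```python
import numpy as np

def perturbed_product_norm(k, mu=0.5, seed=0):
    """Spectral norm of (A + U_1) ... (A + U_k) with ||A|| = 1 and ||U_i|| = mu^i."""
    rng = np.random.default_rng(seed)
    angle = 0.3
    A = np.array([[np.cos(angle), -np.sin(angle)],
                  [np.sin(angle),  np.cos(angle)]])   # rotation, so ||A^k|| = 1
    P = np.eye(2)
    for i in range(1, k + 1):
        U = rng.standard_normal((2, 2))
        U *= mu ** i / np.linalg.norm(U, 2)           # rescale to spectral norm mu^i
        P = P @ (A + U)
    return np.linalg.norm(P, 2)

for k in (1, 5, 20, 100):
    print(k, perturbed_product_norm(k))               # stays below exp(mu / (1 - mu))
```

The lemma itself covers the more general situation where only the products $A_1 \cdots A_k$ are assumed to be uniformly bounded.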

We continue with the proof of Theorem 3.1 by showing that the decay property (18) we
assumed for the data pyramid $c^{(j)}$ also holds for the perturbed data pyramid $\tilde c^{(j)}$.

Lemma 3.6. Under the assumptions of Theorem 3.1 further assume that $S_{lin}$ is a convergent
subdivision scheme. Then there exist constants $s_1$, $s_2$ such that for all $j$, and any choice of
data $\tilde c^{(j)}$, we have

(24) $\|\Delta \tilde c^{(0)}\| \le s_1, \quad \|\tilde d^{(k)}\| \le s_2\,\mu^k$ for all $k$ $\implies \|\Delta \tilde c^{(j)}\| \lesssim (\mu + \varepsilon)^j.$

Here for each $\varepsilon > 0$ the implied constant is uniform.

Proof. (Sketch) We make the simplifying assumption that for all initial data $c$ which occur in
the course of the proof we have

(25) $\|\Delta \mathcal{S}c\| \le \mu \|\Delta c\|.$

This is no big restriction, as it can be shown that such an inequality always holds for some
iterate $\mathcal{S}^N$ of $\mathcal{S}$ and initial data with $\|\Delta c\|$ small enough, provided $S_{lin}$ is convergent [12]. In
case that only

$\|\Delta \mathcal{S}c\| \le \tilde\mu \|\Delta c\|$

for some $\tilde\mu \in (\mu, 1)$, we make the initial $\mu$ larger. This does not change the substance of
Theorem 3.1. With the Lipschitz constants $r$, $r'$ defined by $\|\mathcal{R}c\| \le r\|c\|$, $\|a \oplus b - a\| \le r'\|b\|$
we now estimate:

$\|\Delta \tilde c^{(1)}\| \le \|\Delta \mathcal{S}\tilde c^{(0)}\| + 2\|(\mathcal{S}\tilde c^{(0)} \oplus \mathcal{R}\tilde d^{(1)}) - \mathcal{S}\tilde c^{(0)}\|$

$\le \mu \|\Delta \tilde c^{(0)}\| + 2r'\|\mathcal{R}\tilde d^{(1)}\| \le \mu s_1 + 2rr's_2\mu.$

Iteration of this argument gives the inequality

$\|\Delta \tilde c^{(n)}\| \le s_1\mu^n + 2nrr's_2\mu^n \lesssim (\mu + \varepsilon)^n$

for all $\varepsilon > 0$, which we wanted to show. In case (25) does not hold for $\mathcal{S}$, but only for
an iterate $\mathcal{S}^N$, a similar argument is required which we would like to skip. The reason for
requiring $s_1$, $s_2$ to be 'small enough' is that (25) usually only holds for data $c$ in some set

$P_{M,\delta} := \{c \mid c_k \in M\ \forall k, \text{ and } \|\Delta c\| < \delta\}.$

In general we need to ensure that all $\tilde c^{(j)}$ lie in the set $P_{M,\delta}$ if the only information on the
data is the size of detail coefficients. This rather technical step is where the restrictions
on the constants $s_1$, $s_2$ come in. We chose to skip the technical details regarding this issue,
since we do not find them particularly enlightening and they have already been treated in full
detail in previous work [8] El [7]. □

We are finally in a position to prove Theorem 3.1.

Proof (of Theorem 3.1). The mapping which computes data at level $k$ by way of reconstruction
is denoted by $P_k$. We use the following notation and definition:

(26) $X_k := (c^{(0)}, d^{(1)}, \dots, d^{(k)}) \in \ell(V) \times \ell(W)^k$

(27) $P_k(X_k) := \mathcal{S}P_{k-1}(X_{k-1}) \oplus \mathcal{R}d^{(k)}, \qquad P_0 = \mathrm{id}.$

We first treat the case that $S_{lin}$ is a convergent subdivision scheme and later deal with the
Haar case.

Observe that we can without loss of generality assume that both $\|\Delta c^{(0)}\|$ and the implied
constant in (22) are arbitrarily small. This is because we can simply do a re-indexing
$(c')^{(j)} = c^{(j + j_0)}$, and we assumed exponential decay of $\Delta c^{(j)}$; in particular,

$\|\Delta c^{(0)}\| \le f_1 < s_1, \qquad \|d^{(k)}\| \le f_2\,\mu^k, \quad f_2 < s_2,$

with the constants $s_1$, $s_2$ from Lemma 3.6. By Lemma 3.3, $d^{(j)}$ is likewise of exponential
decay. By the same argument we can make the implied constant arbitrarily small.

Pick the constants $E_1$, $E_2$ such that $f_1 + E_1 < s_1$ and $f_2 + E_2 < s_2$, and consider coarse
data $\tilde c^{(0)}$ and detail coefficients $\tilde d^{(1)}, \dots, \tilde d^{(j)}$ which obey the assumption (19) made in the
statement of the theorem. Lemma 3.6 implies that we have exponential decay of $\|\Delta \tilde c^{(j)}\|$.

The estimates gathered so far enable us to show that there exists a constant $C$ such that
for all $j$, $k$ and all perturbed arguments

$\tilde X_j = (\tilde c^{(0)}, \tilde d^{(1)}, \dots, \tilde d^{(j)}),$

we have the bound

(28) $\Big\|\frac{d}{dc^{(0)}}\Big|_{\tilde X_j} P_j\Big\| \le C, \qquad \Big\|\frac{d}{dd^{(k)}}\Big|_{\tilde X_j} P_j\Big\| \le C.$
Indeed, using the chain rule on the recursive definition (27), we see that

(29) $\frac{d}{dc^{(0)}}\Big|_{\tilde X_j} P_j = \Big[d_1{\oplus}\big|_{(\mathcal{S}P_{j-1}(\tilde X_{j-1}),\, \mathcal{R}\tilde d^{(j)})}\Big]\, \Big[d\mathcal{S}\big|_{\tilde c^{(j-1)}}\Big]\, \frac{d}{dc^{(0)}}\Big|_{\tilde X_{j-1}} P_{j-1}.$

Our assumptions on smoothness (here: $\oplus$ is $C^2$) and the compatibility relation (15) together
imply that

$d_1{\oplus}\big|_{(\mathcal{S}P_{j-1}(\tilde X_{j-1}),\, \mathcal{R}\tilde d^{(j)})} = \mathrm{id} + V_j$

with $\|V_j\| \lesssim \|\mathcal{R}\tilde d^{(j)}\| \lesssim \mu^j$. In order to estimate the term $d\mathcal{S}|_{\tilde c^{(j-1)}}$, we note that
$d\mathcal{S} = S_{lin}$ on constant data implies that $\|d\mathcal{S}|_c - S_{lin}\| \lesssim \|\Delta c\|$ for all initial data $c$, see [8].
Hence we can write

$d\mathcal{S}\big|_{\tilde c^{(j-1)}} = S_{lin} + W_j$, where $\|W_j\| \lesssim \|\Delta \tilde c^{(j-1)}\| \lesssim (\mu + \varepsilon)^j$

for any $\varepsilon > 0$. It is a well known fact that for a convergent subdivision scheme $S_{lin}$, there is a
constant $M$ with $\sup_j \|S_{lin}^j\| \le M$. The previous discussion and iterative application of (29)
implies

$\frac{d}{dc^{(0)}}\Big|_{\tilde X_j} P_j = (S_{lin} + U_1) \cdots (S_{lin} + U_j)$, where $\|U_k\| \lesssim (\mu + \varepsilon)^k.$

Now we invoke Lemma 3.5 and see that indeed the partial derivatives of $P_j$ with respect to
$c^{(0)}$ at $\tilde X_j$ are uniformly bounded, independent of $j$. The derivatives with respect to $d^{(k)}$ can
be handled in an analogous manner. This shows (28), from which it is easy to see (20).

Having concluded the proof in the case that $S_{lin}$ is a convergent subdivision scheme, we
turn to the Haar case. It is analogous, but because we have $\mathcal{S} = S_{lin}$ we do not need the
perturbation inequalities at all to estimate differentials (in particular we do not need Lemma
3.6). □

Remark 3.7. The only place where the constants $E_1$, $E_2$ come into play is the assumption
(25), which is usually only satisfied for data in some set $P_{M,\delta}$; see the discussion in the proof
of Lemma 3.6. It is easy to see that if $\mathcal{S}$ is defined and contractive for all initial data, then
the constants $E_1$, $E_2$ can be arbitrarily large.



4. Obtaining discrete data 

4.1. Convolution and smoothing of manifold-valued data. Here we are going to investigate
further properties of the geometric average which was defined by Equations (7)
and (10). They will become important in Section 4.2. This material is already contained in
Karcher's paper [9] as far as surfaces and Riemannian geometry are concerned. Here we also
show the extension to Lie groups, which is not difficult once the Riemannian case is known.

Convolution with a function $\psi$ with $\int \psi = 1$ can be interpreted as an average. This applies
to multivariate functions as well as to univariate ones, which are our main concern. In order
to fit the previous definitions, we give an equivalent construction of the convolution $g * \psi$ for
vector-valued functions $g$, and at the same time a definition of $(f \circledast \psi)(u)$ for manifold-valued
functions $f : \mathbb{R}^d \to M$.

(30) $m = (g * \psi)(u) \iff m = \int_{\mathbb{R}^d} g(x)\psi(u - x)\,dx \iff \int_{\mathbb{R}^d} (g(x) - m)\psi(u - x)\,dx = 0,$

(31) $m = (f \circledast \psi)(u) \iff \int_{\mathbb{R}^d} (f(x) \ominus m)\psi(u - x)\,dx = 0.$
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The even more general case where the domain of functions are manifolds has been discussed 
in [9]. It turns out that basically any nonnegative kernel function i\) supported in the cube 
[—1, l] d can be used for smoothing in the following way: For each p > 0, we let 

(32) fP = f ® where V p (x) = ^U(- 

p a \p 
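
The implicit definition (31) can be made concrete in a discrete setting, where the integral is replaced by a finite weighted sum. The following sketch computes such a weighted geometric average on the unit sphere. It assumes that $\oplus$ and $\ominus$ are realized by the Riemannian exponential map and its inverse, and it uses a plain fixed-point iteration. The helper names (`sphere_exp`, `sphere_log`, `residual`, `geometric_mean`) and the sample data are illustrative choices, not taken from the paper.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(a):
    n = math.sqrt(dot(a, a))
    return [x / n for x in a]

def sphere_exp(p, v):
    """p (+) v: follow the great circle through p with initial velocity v."""
    t = math.sqrt(dot(v, v))
    if t < 1e-15:
        return list(p)
    return normalize([math.cos(t) * a + math.sin(t) * b / t for a, b in zip(p, v)])

def sphere_log(p, q):
    """q (-) p: tangent vector at p pointing towards q, of length dist(p, q)."""
    c = dot(p, q)
    w = [b - c * a for a, b in zip(p, q)]   # part of q orthogonal to p
    n = math.sqrt(dot(w, w))
    if n < 1e-15:
        return [0.0, 0.0, 0.0]
    theta = math.atan2(n, c)                # geodesic distance between p and q
    return [theta * x / n for x in w]

def residual(m, points, weights):
    """Discrete analogue of the integral in (31): sum_i w_i (x_i (-) m)."""
    g = [0.0, 0.0, 0.0]
    for x, w in zip(points, weights):
        g = [a + w * b for a, b in zip(g, sphere_log(m, x))]
    return g

def geometric_mean(points, weights, steps=50):
    """Fixed-point iteration m <- m (+) residual(m), started at the first point."""
    m = list(points[0])
    for _ in range(steps):
        m = sphere_exp(m, residual(m, points, weights))
    return m

# Hypothetical sample data: three nearby points, weights summing to 1.
points = [normalize(p) for p in ([1.0, 0.1, 0.0], [1.0, -0.1, 0.2], [1.0, 0.2, -0.1])]
weights = [0.5, 0.3, 0.2]
mean = geometric_mean(points, weights)
```

For data concentrated in a small geodesic ball the iteration contracts quickly; for widely spread data neither existence nor uniqueness of the average should be taken for granted.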

We want to show that $f$ and its differential $df$ are approximated by $f^\rho$ and $df^\rho$ as $\rho$ approaches 
zero. The proofs consist of revisiting the proofs given in [9], which apply to the Riemannian 
case. 

Theorem 4.1. Consider the smoothed functions $f^\rho$ defined by a function $f : \mathbb{R}^d \to M$ and a 
kernel $\psi$ as above. Then 

$\lim_{\rho \to 0} f^\rho = f, \quad \lim_{\rho \to 0} df^\rho = df.$ 
If $f$ is Lipschitz differentiable, then this convergence is linear. 
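
The statement of Theorem 4.1 can be observed numerically in the simplest Lie group, the unit circle in the complex plane, where $p \oplus v = p\,e^{iv}$ and $q \ominus p$ is the phase of $q/p$. The sketch below is only an illustration under assumptions: the test curve `theta`, the hat kernel `hat`, the midpoint quadrature, and the evaluation point `0.3` are all hypothetical choices, and the derivative of $f^\rho$ is approximated by a central difference.

```python
import cmath
import math

def theta(x):
    """Angle of the hypothetical test curve f(x) = exp(i * theta(x))."""
    return math.sin(x) + 0.5 * x * x

def f(x):
    return cmath.exp(1j * theta(x))

def hat(x):
    """Nonnegative kernel supported in [-1, 1] with integral 1."""
    return max(0.0, 1.0 - abs(x))

def smooth(u, rho, n=400):
    """Approximate f^rho(u): the m with sum_k w_k (f(x_k) (-) m) = 0."""
    h = 2.0 * rho / n
    xs = [u - rho + (k + 0.5) * h for k in range(n)]   # midpoint nodes
    ws = [hat((u - x) / rho) for x in xs]
    total = sum(ws)
    ws = [w / total for w in ws]                        # weights sum to 1
    m = f(u)                                            # initial guess
    for _ in range(5):
        # q (-) p is the phase of q / p, and p (+) v is p * exp(i v)
        step = sum(w * cmath.phase(f(x) / m) for x, w in zip(xs, ws))
        m = m * cmath.exp(1j * step)
    return m

def value_error(u, rho):
    """Distance on the circle between f^rho(u) and f(u)."""
    return abs(cmath.phase(smooth(u, rho) / f(u)))

def derivative_error(u, rho, h=1e-4):
    """Central difference of f^rho compared with theta'(u) = cos(u) + u."""
    d_rho = cmath.phase(smooth(u + h, rho) / smooth(u - h, rho)) / (2.0 * h)
    return abs(d_rho - (math.cos(u) + u))

rhos = [0.4, 0.2, 0.1]
value_errors = [value_error(0.3, r) for r in rhos]
derivative_errors = [derivative_error(0.3, r) for r in rhos]
```

Both error lists should shrink as `rho` decreases, in line with the theorem. Because the hat kernel is symmetric, the observed decay in this example is typically faster than the linear rate the theorem guarantees.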

Proof. We skip convergence of $f^\rho$ and show only convergence of $df^\rho$. The proof is in the spirit 
of Lemma 4.2 and Theorem 4.4 of [9], the difference being that the domain of $f$ is a vector 
space. We define $V : \mathbb{R}^d \times M \to \mathbb{R}^{\dim M}$ by letting 

$V(u, p) := \int (f(x) \ominus p)\,\psi_\rho(u - x)\,dx.$ 

By definition, $V(u, f^\rho(u)) = 0$. This implies the following equation of derivatives: 

(33) $d_1|_{u, f^\rho(u)} V + D_2|_{u, f^\rho(u)} V \circ df^\rho = 0.$ 

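The capital $D$ indicates the fact that in the Riemannian case we employ a covariant derivative. 
The partial derivatives of $V$ have the form 

$d_1|_{u,p} V(\dot u) = \frac{d}{dt}\Big|_{t=0} \int (f(y) \ominus p)\,\psi_\rho(u(t) - y)\,dy = \frac{d}{dt}\Big|_{t=0} \int (f(x - u + u(t)) \ominus p)\,\psi_\rho(u - x)\,dx,$ 

$D_2|_{u,p} V(\dot p) = \frac{D}{dt}\Big|_{t=0} \int (f(x) \ominus p(t))\,\psi_\rho(u - x)\,dx.$ 

Using the functions $E_{p,q}(\dot q) = -\frac{D}{dt}(p \ominus q(t))$ and $F_{p,q}(\dot p) = \frac{d}{dt}(p(t) \ominus q)$, we get 

(34) $d_1|_{u,p} V(\dot u) + D_2|_{u,p} V(\dot p) = \int \big(F_{f(x),p}(df_x(\dot u)) - E_{f(x),p}(\dot p)\big)\,\psi_\rho(u - x)\,dx.$ 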

It is shown in [9] that in the Riemannian case the functions $E_{p,q}$ and $F_{p,q}$ can be bounded in 
terms of the sectional curvature $K$ and the parallel transport operator $\mathrm{Pt}_{\mathrm{from}}^{\mathrm{to}}$: 

$E_{p,q}(v) = v + R, \quad F_{p,q}(v) = \mathrm{Pt}_p^q(v) + R',$ 

where $\|R\| \le \|v\| \cdot \mathrm{const}(\min K, \max K) \cdot \mathrm{dist}(p,q)^2$ and $\|R'\| \le \|F_{p,q}(v)\| \cdot \mathrm{const}(\max |K|) \cdot 
\mathrm{dist}(p,q)^2$. Letting $p = f^\rho(u)$ and $\dot p = df^\rho(\dot u)$, we convert (33) and (34) into the integral 

$\int \big(\mathrm{Pt}_{f(x)}^{f^\rho(u)}\, df_x(\dot u) + R'(x) - df^\rho(\dot u) - R(x)\big)\,\psi_\rho(u - x)\,dx = 0,$ 

without indicating the dependence of the remainder terms $R, R'$ on $x$. The assumption that 
$f$ is $C^1$ implies that for all $x$ which contribute to the integral (i.e., $\psi_\rho(u - x) \neq 0$), we have, as $\rho \to 0$, 
$x \to u$, $df_x(\dot u) \to df_u(\dot u)$, $f^\rho(x) \to f(u)$, $\mathrm{Pt} \to \mathrm{id}$, $R \to 0$, $R' \to 0$. Observe that all these 
limits have at least linear convergence rate, provided $df$ is Lipschitz. With $\int \psi = 1$, we obtain 

$\lim_{\rho \to 0} (df_x - df^\rho)(\dot u) = 0,$ 
where the convergence is linear if $df$ is Lipschitz. This concludes the proof in the Riemannian case. 

In the Lie group case, it is not difficult to compute the derivatives $E_{p,q}(\dot q)$ and $F_{p,q}(\dot p)$ by 
means of the Baker-Campbell-Hausdorff formula, which says $\log(e^x e^y) = x + y + \frac{1}{2}[x,y] + \cdots$, 
where the dots indicate terms of third and higher order expressible by Lie brackets. When 
$p$ and $q = pe^z$ undergo 1-parameter variations of the form $p(t) = pe^{tw}$ and $q(t) = qe^{tw}$ with 
$w \in \mathfrak{g}$, then 

$p(t) \ominus q = \log(e^z e^{tw}) = z + tw + \frac{1}{2}[z, tw] + \cdots,$ 

$p \ominus q(t) = \log(e^{-tw} e^z) = -tw + z + \frac{1}{2}[-tw, z] + \cdots.$ 

This implies 

$F_{p,q}(w) = w + \frac{1}{2}[z,w] + \cdots,$ 

$E_{p,q}(w) = w + \frac{1}{2}[w,z] + \cdots.$ 

Similar to the Riemannian case above, we convert (33) and (34) into the integral 

$\int \big(df_x(\dot u) + \frac{1}{2}[f^\rho(u) \ominus f(x),\, df_x(\dot u) + df^\rho(\dot u)] - df^\rho(\dot u) + \cdots\big)\,\psi_\rho(u - x)\,dx = 0$ 

in the Lie algebra. The same arguments imply $x \to u$, $f^\rho(x) \to f(u)$, $df_x(\dot u) \to df_u(\dot u)$, and as 
a consequence $df^\rho \to df$ as $\rho \to 0$. This concludes the proof of Theorem 4.1 in the Lie group 
case. □ 
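
The second-order truncation of the Baker-Campbell-Hausdorff formula used in the Lie group part of the proof can be checked numerically. The sketch below works with $2 \times 2$ real matrices and hand-written power series for `mat_exp` and `mat_log`. The generators `X` and `Y` and all helper names are illustrative assumptions, not part of the paper. Halving the scale `s` should reduce the truncation error by a factor near 8, which is what a third-order remainder predicts.

```python
import math

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_axpy(a, b, s=1.0):
    """Return a + s * b."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_scale(a, s):
    return [[s * x for x in row] for row in a]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_exp(a, terms=30):
    """Matrix exponential by its Taylor series (adequate for small matrices)."""
    result, power = identity(len(a)), identity(len(a))
    for k in range(1, terms):
        power = mat_scale(mat_mul(power, a), 1.0 / k)   # a^k / k!
        result = mat_axpy(result, power)
    return result

def mat_log(g, terms=80):
    """log(g) = sum_k (-1)^(k+1) (g - I)^k / k, valid for ||g - I|| < 1."""
    a = mat_axpy(g, identity(len(g)), -1.0)
    result, power = mat_scale(a, 0.0), identity(len(g))
    for k in range(1, terms):
        power = mat_mul(power, a)                       # (g - I)^k
        result = mat_axpy(result, power, (-1.0) ** (k + 1) / k)
    return result

def bracket(x, y):
    """Matrix commutator [x, y] = xy - yx."""
    return mat_axpy(mat_mul(x, y), mat_mul(y, x), -1.0)

def frobenius(a):
    return math.sqrt(sum(x * x for row in a for x in row))

# Hypothetical non-commuting generators.
X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]

def bch_error(s):
    """Norm of log(e^{sX} e^{sY}) - (sX + sY + [sX, sY] / 2)."""
    x, y = mat_scale(X, s), mat_scale(Y, s)
    exact = mat_log(mat_mul(mat_exp(x), mat_exp(y)))
    second_order = mat_axpy(mat_axpy(x, y), bracket(x, y), 0.5)
    return frobenius(mat_axpy(exact, second_order, -1.0))

ratio = bch_error(0.1) / bch_error(0.05)
```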

4.2. The passage from continuous to discrete data. In the analysis of multiscale 
decompositions one frequently assumes an infinite detail pyramid. In practice a vector-valued or 
manifold-valued function $f(t)$ which depends on a parameter $t \in \mathbb{R}$ is given by finitely many 
measurements. Such measurements might be samples at parameters $t_i = ih$, for some small $h$; 
or measurements might be modeled as averages of the form $(f \circledast \varphi(\cdot - ih))_{i \in \mathbb{Z}}$, where $\varphi$ is some 
kernel with $\int \varphi = 1$ and $\mathrm{supp}(\varphi)$ small (in fact physics excludes the kind of measurement we 
called samples and permits only $\varphi$ to approach the Dirac delta). 

In the linear case any multiscale decomposition based on midpoint-interpolation, and 
especially the Haar scheme, is well adapted to deal with averages: The decimation operator $D$ 
in this case is consistent with the definition of discrete data as follows: 

$c^{(j-1)} = D c^{(j)}.$ 
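
For the linear Haar scheme this consistency is easy to verify by hand. The sketch below assumes, for illustration only, that the data at level $j$ are exact box averages of a scalar function over the dyadic cells $[i2^{-j}, (i+1)2^{-j}]$ and that Haar decimation averages neighbouring pairs. The test function $t^3$ and the names `cell_average` and `haar_decimate` are hypothetical.

```python
def cell_average(j, i):
    """Exact mean of the hypothetical signal f(t) = t**3 over [i 2^-j, (i+1) 2^-j]."""
    a, b = i * 2.0**-j, (i + 1) * 2.0**-j
    return (b**4 - a**4) / (4.0 * (b - a))

def haar_decimate(c):
    """Linear Haar decimation D: average neighbouring pairs."""
    return [0.5 * (c[2 * i] + c[2 * i + 1]) for i in range(len(c) // 2)]

j = 4
fine = [cell_average(j, i) for i in range(2**j)]            # level-j data
coarse = [cell_average(j - 1, i) for i in range(2**(j - 1))]  # level-(j-1) data
decimated = haar_decimate(fine)                             # D applied to level j
```

Here `decimated` agrees with `coarse` up to rounding, because the mean over a cell is the mean of the means over its two halves.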



We have no analogous relation for manifold-valued multiscale decompositions. Nevertheless 
we may let 

In view of Theorem 4.1 this yields discrete data whose discrete derivatives $\Delta c^{(j)}$ approximate 
the derivatives of $f$. Assuming $f$ to be $C^2$, we have 

Acf := - j?) =► *cjp = + 0(2-) = j f f\ k2 . 3 + 0(2-'). 

The previous equation is to be interpreted in any smooth coordinate chart of the manifold 
under consideration. 